Gene expression is inherently noisy as many steps in the read-out of the genetic information are
stochastic. To disentangle the effect of different sources of stochasticity in such systems, we consider 
various models that describe some processes as stochastic and others as deterministic. We review 
earlier results for unregulated (constitutive) gene expression and present new results for a gene 
controlled by negative autoregulation with cell growth modeled by linear volume growth. 
I. INTRODUCTION

Mathematical and physical methods and concepts are
increasingly used in the life sciences. For example, the
dynamics of gene regulatory circuits is often studied
by describing these circuits with simple but non-trivial
mathematical models. An important issue is then
the choice of an appropriate level of mathematical
description. Cells are very dynamic and often adapt to the
external conditions by changing their global properties,
which may in turn affect the function of genetic circuits
hosted by that cell. Thus one needs to address the
question of how one can mathematically model such a
dynamic system, at least under constant external
conditions. Moreover, every individual cell grows and divides
while the circuits it contains perform their programmed
functions. Many models use a mean-field-like description
averaging over these processes, but it is often not clear
how accurate such approximations are. For example, the
gene copy number is often described by an average and
the actual doubling of the gene during the cell division
cycle is not considered.

Yet another issue is whether the description of such
a regulatory circuit should be deterministic or stochastic.
Many important molecules are present in a cell in
low copy numbers. Hence, fluctuations can be expected
to be important, so that a stochastic description of gene
expression is necessary. These effects, which have
been studied theoretically for a long time, have
recently become accessible to direct quantitative
experiments thanks to the development of single-cell approaches.

During the division cycle of a cell, stochasticity arises
from different sources and at different points, namely
from the inherent stochasticity of the synthesis of
proteins (which occurs throughout the division cycle) and
from the partitioning of the protein molecules among the
daughter cells during cell division (an approximately
instantaneous event). An obvious question that arises in
this context is whether there is a dominant source of
noise. This question is related to the problem of which
mathematical description is most appropriate: Which
sources of noise need to be taken into account for a
minimal but realistic description? Do different descriptions
of the noise (with or without explicit volume growth,
with explicit or implicit cell division, etc.) lead to
approximately the same predictions, or are there considerable
differences between these descriptions concerning the noise
that is generated? In a recent study [23], we have
addressed some of these issues by considering various simple
models that include or exclude certain sources of noise.
The comparison of these results has shown that often
there is no dominant source of noise, i.e. that different
sources contribute comparably (an exception is so-called
bursty protein synthesis: if many proteins are produced
from relatively rare transcription events, this bursting is
clearly the dominant source of noise). The absence of a
dominant noise source means, on the one hand, that all
sources have to be included for accurate results but, on
the other hand, that omitting any one of those sources
will still lead to fluctuations of the same order of
magnitude.

In our previous study, these questions were studied
for unregulated genes. Here we extend our approach
to regulated genes. We focus on a simple but
important regulatory system, namely negative autoregulation,
where the protein product of a gene controls the
read-out of that gene, such that large concentrations of
the protein suppress further synthesis of that protein.
Fluctuations arising from both sources we
consider (stochastic synthesis and stochastic partitioning
during cell division) are found to be suppressed
substantially by the negative feedback. A complication that
arises generally for regulated genes is that regulation
depends on protein concentrations, which in turn depend on
the cell volume. This means that the growth of the cellular
volume needs to be taken into account explicitly, which
was not necessary for unregulated genes that could be
described fully by the number of protein molecules per
cell.

The paper is organized as follows: In Section II we
review some key results from our previous study [23]
comparing different sources of noise for unregulated gene
expression. An alternative analytical derivation of one
central result is presented in the appendix. In Section III
this type of analysis is extended to a gene controlled by
negative autoregulation. We end with some concluding
remarks.



II. SOURCES OF STOCHASTICITY FOR 
CONSTITUTIVE GENE EXPRESSION 

Recently we studied different models for the stochastic
gene expression of an unregulated (constitutively
expressed) gene in order to disentangle different sources of
(intrinsic) stochasticity [23]. Stochasticity arises from the
process of protein synthesis and also from degradation, if
the proteins are unstable. When a cell divides, the
partitioning of proteins among daughter cells also generates
fluctuations. In our recent work [23], we analyzed these
different sources in a systematic way to see which sources
contribute to the observed noise and whether there is a
dominant source. In this section, we briefly summarize
some key results obtained with these models.

Protein synthesis is a two-step process consisting of
transcription and translation. In the first step, a gene
sequence is transcribed into mRNA, which is then translated
by ribosomes to produce proteins. If $M$ and $P$
represent the mRNA and protein copy numbers, respectively,
their time evolution is described by

$$
\begin{aligned}
\dot{M} &= \alpha_m g - \beta_m M, \\
\dot{P} &= \alpha_p M - \beta_p P,
\end{aligned}
\tag{1}
$$

where $\alpha_m$, $\alpha_p$ and $\beta_m$, $\beta_p$ are the synthesis and
degradation rates of mRNAs and proteins, respectively, and $g$ is the
gene copy number. In bacteria, proteins are often stable
(with lifetimes long compared to the generation time $T$)
[28]. Then the protein degradation rate in Eq. (1) is an effective
degradation rate representing dilution by cell growth and
division, with $\beta_p = \beta = \ln 2/T$. By contrast, mRNA is typically
rather short-lived, with lifetimes in the range of a few
minutes [22, 27], and one can approximate the equation for
$M$ by its steady state, $M = \alpha_m g/\beta_m$. In that case the
above two-step process is reduced to an effective one-step
process,

$$
\dot{P} = \alpha g - \beta P, \tag{2}
$$

with $\alpha = \alpha_p \alpha_m/\beta_m$. We note that when
mRNA is treated as a fast variable and considered to be
in a steady state, one obtains a correct description of the
average protein number, but the fluctuations are
underestimated, in particular if a single transcription event
(or one mRNA molecule) gives rise to many copies of the
protein. This effect, where the output of transcription
is strongly amplified by translation, is known as bursty
protein synthesis and will not be considered here. The
reader is referred to Ref. [23] for a discussion of this
issue.
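The reduction from the two-step to the one-step description can be checked numerically at the level of the averages. The following is a minimal sketch, not taken from the original study: it integrates both deterministic descriptions with a forward-Euler scheme and compares the protein numbers at late times. The values of `alpha_m`, `beta_m` and `alpha_p` are hypothetical and chosen only so that the effective rate is 0.5/min with a generation time of 40 min.

```python
import math

def integrate(two_step, alpha_m, beta_m, alpha_p, beta, g=1, t_end=1000.0, dt=0.01):
    """Forward-Euler integration of the two-step model or its one-step reduction."""
    M, P = 0.0, 0.0
    alpha = alpha_p * alpha_m / beta_m        # effective one-step synthesis rate
    for _ in range(round(t_end / dt)):
        if two_step:
            dM = alpha_m * g - beta_m * M     # transcription and mRNA decay
            dP = alpha_p * M - beta * P       # translation and dilution
        else:
            dM = 0.0
            dP = alpha * g - beta * P         # mRNA replaced by its steady state
        M += dM * dt
        P += dP * dt
    return P

# Hypothetical rates (per minute); beta represents dilution with T = 40 min.
params = dict(alpha_m=1.0, beta_m=0.5, alpha_p=0.25, beta=math.log(2) / 40.0)
print("two-step:", integrate(True, **params))
print("one-step:", integrate(False, **params))
```

Both runs should approach the same stationary value $\alpha g/\beta$. The sketch only addresses the mean; as stated above, the fluctuations of the two descriptions differ.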

We now consider different models [23] that are based
on Eq. (2) but describe cell division explicitly. In that
case, the degradation rate for stable proteins is $\beta = 0$
and the protein copy number per cell is divided by 2 at
FIG. 1. Stochastic models of unregulated protein synthesis:
(a)-(c) Trajectories of the protein copy number from stochastic
simulations with stochastic synthesis, partitioning during
cell division, or both, all with cell division modeled explicitly.
(d)-(f) Corresponding concentrations of the protein calculated
for a volume that increases linearly during the division cycle
and is halved at multiples of the division time $T$. Here the
cell volume does not affect the protein synthesis rate.
The parameter values used for these plots are $\alpha = 0.5$/min,
$T = 40$ min.
FIG. 2. Stochastic models of protein synthesis: Noise strength
$\eta^2$ as a function of the average protein copy number $\langle P \rangle$
(varied by varying the synthesis rate $\alpha$) for the different
models (for the models with explicit cell division, averages over
cells immediately after division are plotted, i.e. $\eta_0^2$ and $\langle P_0 \rangle$).
$T = 40$ min.



cell division. We start with a model where protein
synthesis is described deterministically, while proteins are
distributed stochastically among the two daughter cells
during cell division (we note that in all our models a
cell divides into exactly two daughter cells and we follow
only one lineage of cells; more complex cases are not
considered here). So during division each protein molecule
has a probability $r = 1/2$ to go to either of the daughter
cells. This means that in every generation a constant
number $Q = \alpha T$ of protein molecules is synthesized,
but the initial protein number in each cycle fluctuates
due to the stochastic division. Figure 1(a) shows a time
series of such a process as obtained from simulations. For
this case we have obtained a number of analytical results
[23] using a method proposed in Ref. [29]. An alternative
derivation based on generating functions is given
in Appendix A. The average copy number after division
and the variance of that number are found to be
$\langle P_0 \rangle = Q = \alpha T$ and $\delta P_0^2 = 2Q/3$, respectively. Two
commonly used characteristics of noise are the noise strength
$\eta^2$, defined as

$$
\eta^2 = \frac{\langle (P - \langle P \rangle)^2 \rangle}{\langle P \rangle^2}, \tag{3}
$$

and the Fano factor $F = \eta^2 \langle P \rangle$. $\eta^2$ typically scales as
$\eta^2 \sim 1/\langle P \rangle$, so the Fano factor characterizes
the prefactor of that scaling. For the case
under consideration, we obtain

$$
\eta_0^2 = \frac{2}{3 \langle P_0 \rangle}, \tag{4}
$$

or $F_0 = 2/3$ (the index '0' in these expressions indicates
that we have taken averages over a population of cells
immediately after division).
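The result for deterministic synthesis with stochastic partitioning can be illustrated with a short simulation. This is a minimal sketch rather than the simulation code used for the figures: it follows one lineage, adds $Q = \alpha T$ proteins per cycle, and keeps each protein with probability $r = 1/2$ at division. The helper names and the number of generations are assumptions made for illustration.

```python
import random

def binomial_half(n, rng):
    """Sample Binomial(n, 1/2) by counting set bits among n random bits."""
    if n == 0:
        return 0
    return bin(rng.getrandbits(n)).count("1")

def simulate_partitioning(Q, generations, seed=0):
    """Deterministic synthesis of Q proteins per cycle, binomial partitioning.

    Returns the protein numbers recorded immediately after each division.
    """
    rng = random.Random(seed)
    p0 = Q                                    # start at the expected mean
    samples = []
    for _ in range(generations):
        before_division = p0 + Q              # deterministic synthesis, Q = alpha * T
        p0 = binomial_half(before_division, rng)  # each protein kept with r = 1/2
        samples.append(p0)
    return samples

def mean_and_fano(samples):
    """Sample mean and Fano factor (variance divided by mean)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var / mean

alpha, T = 0.5, 40.0                          # values from the Fig. 1 caption
Q = int(alpha * T)
mean, fano = mean_and_fano(simulate_partitioning(Q, 50_000))
print(f"mean = {mean:.2f}, Fano factor = {fano:.3f}")
```

With these parameters the sample mean should lie close to $Q = 20$ and the Fano factor close to $2/3$, up to sampling error.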

In the complementary case, synthesis of proteins is
stochastic and division deterministic. So when a cell
divides, each daughter cell gets exactly half of the available
proteins, as shown in Figure 1(b) (for an odd number of
proteins $P$, we take the number after division to be either
$(P+1)/2$ or $(P-1)/2$, each with probability 1/2, thus
leading to a minimal remnant of stochasticity in our
otherwise deterministic description of cell division). We also
assume the synthesis rate to be constant and do not
explicitly describe gene duplication. We then obtain

$$
\langle P_0 \rangle = \alpha T, \qquad
\delta P_0^2 = \frac{\alpha T}{3}, \qquad \text{and} \qquad
\eta_0^2 = \frac{1}{3 \langle P_0 \rangle}. \tag{5}
$$

The last result implies that the Fano factor is $F_0 = 1/3$,
which is just half of what we have seen for stochastic
partitioning (Eq. (4)).
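The complementary case can be sketched in the same way. The code below is an illustration under stated assumptions, not the authors' implementation: synthesis events are generated with exponentially distributed waiting times at a constant rate, and division halves the protein number, with an odd leftover protein assigned to either daughter with probability 1/2. The function names are hypothetical.

```python
import random

def synthesized_in_cycle(alpha, T, rng):
    """Number of synthesis events in time T for a constant rate alpha."""
    count, t = 0, rng.expovariate(alpha)
    while t < T:
        count += 1
        t += rng.expovariate(alpha)
    return count

def halve(P, rng):
    """Deterministic halving; an odd leftover goes either way with probability 1/2."""
    half, odd = divmod(P, 2)
    return half + (1 if odd and rng.random() < 0.5 else 0)

def simulate_synthesis_noise(alpha, T, generations, seed=0):
    """Stochastic synthesis, deterministic division; returns P_0 after each division."""
    rng = random.Random(seed)
    P0, samples = round(alpha * T), []
    for _ in range(generations):
        P0 = halve(P0 + synthesized_in_cycle(alpha, T, rng), rng)
        samples.append(P0)
    return samples

samples = simulate_synthesis_noise(alpha=0.5, T=40.0, generations=20_000)
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(f"mean = {mean:.2f}, Fano factor = {var / mean:.3f}")
```

The Fano factor should come out near $1/3$. The random assignment of the odd protein adds a small extra contribution, so the estimate is expected to sit slightly above that value.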

Finally, we combine both sources of stochasticity, so that
synthesis as well as partitioning of proteins occur
stochastically (Figure 1(c)). Using again the method of Ref. [29],
we obtain

$$
\langle P_0 \rangle = \alpha T, \qquad
\delta P_0^2 = \alpha T, \qquad \text{and} \qquad
\eta_0^2 = \frac{1}{\langle P_0 \rangle}. \tag{6}
$$

Points to be noted are: (i) Independent noise sources
contribute additively to the noise strength $\eta^2$. In our case, the
noise in Eq. (6) is the sum of the noise components from
stochastic partitioning, $2/(3\langle P_0 \rangle)$, and from stochastic
synthesis, $1/(3\langle P_0 \rangle)$. (ii)
The contributions from both sources of noise are of the
same order of magnitude, implying that there is no
dominant source of noise in this simple case.
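The additivity in point (i) can be made plausible with the law of total variance. The following is a sketch under the assumption that the number $S$ of proteins synthesized per cycle is Poisson distributed with mean $\alpha T$ and independent of $P_0$; it is not necessarily the method of Ref. [29]. Here $P_0'$ denotes the protein number after the next division.

```latex
% Assumptions: S ~ Poisson(\alpha T), and P_0' given P_0 + S is Binomial(P_0 + S, 1/2).
\begin{align*}
\mathrm{Var}(P_0')
  &= \mathrm{E}\!\left[\tfrac{1}{4}(P_0 + S)\right] + \tfrac{1}{4}\,\mathrm{Var}(P_0 + S) \\
  &= \underbrace{\tfrac{1}{4}\cdot 2\alpha T}_{\text{partitioning}}
     + \tfrac{1}{4}\bigl(\delta P_0^2 + \underbrace{\alpha T}_{\text{synthesis}}\bigr).
\end{align*}
% Steady state, \mathrm{Var}(P_0') = \delta P_0^2:
\begin{align*}
\tfrac{3}{4}\,\delta P_0^2 = \tfrac{1}{2}\alpha T + \tfrac{1}{4}\alpha T
\quad\Longrightarrow\quad
\delta P_0^2 = \tfrac{2}{3}\alpha T + \tfrac{1}{3}\alpha T = \alpha T .
\end{align*}
```

Dividing by $\langle P_0 \rangle^2 = (\alpha T)^2$ gives $\eta_0^2 = 2/(3\langle P_0 \rangle) + 1/(3\langle P_0 \rangle) = 1/\langle P_0 \rangle$. For $\alpha = 0.5$/min and $T = 40$ min, i.e. $\langle P_0 \rangle = 20$, this is approximately $0.033 + 0.017 = 0.05$.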

In Figure 1(d)-(f) we show time series for the concentrations
of the protein for the three cases discussed above.
The concentration fluctuates around its mean and shows
no systematic dependence on the cell division cycle. The
latter observation arises from the fact that both the
volume and the protein number increase (on average)
linearly during the cycle. A systematic variation over the
FIG. 3. Stochastic models of protein synthesis with negative
autoregulation: (a)-(c) Trajectories of the protein copy
number from stochastic simulations with stochastic synthesis,
cell division, or both, all with cell division modeled
explicitly and linear volume growth. (d)-(f) Corresponding
concentration of the protein. The parameter values used for these
plots are $\alpha_0 = 2.0$/min, $V_0$ … molecules/$\mu$m$^3$.
FIG. 4. Noise strength $\eta_0^2$ as a function of the average
protein copy number $\langle P_0 \rangle$ (varied by varying the synthesis
rate $\alpha_0$) for the different models with negative autoregulation.
Averages are taken over cells immediately after cell division.
The parameter values are $V_0 = 2\,\mu$m$^3$, $T = 50$ min, $K = 100$
molecules/$\mu$m$^3$.



course of the division cycle is obtained if an explicit
description of gene duplication is included or if the volume
growth is not linear [23].



III. PROTEIN SYNTHESIS WITH NEGATIVE 
AUTOREGULATION 

Gene regulation is incorporated into models of the type
of Eq. (2) via synthesis (or degradation) rates that depend
on the concentration of a regulatory protein, for example
a transcription factor. Here we consider one specific case,
namely negative autoregulation, where the output of a
gene (the protein product) modifies the read-out of that
gene in such a way that the synthesis of a protein is
suppressed by a high concentration of that protein.
The dependence of the synthesis rate on the protein
concentration $p = P/V$ is expressed by a so-called Hill
function,

$$
\alpha(p) = \frac{\alpha_0}{1 + (p/K)^n}. \tag{7}
$$

Here $\alpha_0$ is the maximal synthesis rate and $K$ is a
concentration scale that defines which protein concentration is
required to affect the synthesis rate (in the simplest case,
it is given by the dissociation constant for binding of a
transcription factor to its binding site on the DNA). $n$
is called the Hill coefficient, which describes the
cooperativity of regulation and characterizes the steepness of
the regulation function. In the following we take $n$ to be
equal to 2.
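As a small illustration of the regulation function, the sketch below evaluates the synthesis rate for a few protein numbers, assuming the standard repressive Hill form $\alpha(p) = \alpha_0/(1 + (p/K)^n)$ with $p = P/V$. The function name is hypothetical; the parameter values are those quoted in the figure captions.

```python
def synthesis_rate(P, V, alpha0, K, n=2):
    """Repressive Hill function alpha(p) = alpha0 / (1 + (p / K)**n), with p = P / V."""
    p = P / V                                  # concentration in molecules per unit volume
    return alpha0 / (1.0 + (p / K) ** n)

# alpha0 in 1/min, K in molecules per cubic micrometre, V0 in cubic micrometres.
alpha0, K, V0 = 2.0, 100.0, 2.0
for P in (0, 100, 200, 400):
    print(P, synthesis_rate(P, V0, alpha0, K))
```

At $p = K$ (here $P = 200$ molecules in the volume $V_0$) the rate is half of its maximal value, which is why the crossover in the noise is expected for protein numbers of the order of $K V_0$.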

The synthesis rate $\alpha(p)$ in Eq. (7) is time-dependent
through both the protein copy number (which changes in
discrete steps of synthesis and degradation) and the cell
volume (which changes in a continuous fashion). So in
contrast to the unregulated case discussed before, volume
growth affects the dynamics of the protein synthesis
process in the presence of (concentration-dependent) gene
regulation. In our case, volume growth is taken to be
linear in time, starting from an initial volume $V_0$ directly
after cell division and reaching $2V_0$ just before the next
division. The growth is implemented via a discrete time
step $\Delta t$ in which the volume increases by $\Delta V$. The
volume is halved exactly at the division. We are again
interested in the different sources of stochasticity and consider
the synthesis of protein and the partitioning of molecular
content to be either stochastic or deterministic.
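The scheme described above can be sketched as follows for the case in which both synthesis and partitioning are stochastic. This is a minimal illustration, not the original simulation code: it assumes a repressive Hill function for the synthesis rate, allows at most one synthesis event per time step (which requires $\alpha_0 \Delta t \ll 1$), and uses a hypothetical step size `dt = 0.05` min, since the value of $\Delta t$ is not specified in the text.

```python
import random

def hill_rate(P, V, alpha0, K, n=2):
    """Synthesis rate for concentration p = P / V (assumed repressive Hill form)."""
    return alpha0 / (1.0 + (P / (V * K)) ** n)

def run_cycle(P, alpha0, K, V0, T, dt, rng):
    """Advance one division cycle; return the protein number just before division."""
    steps = round(T / dt)
    dV = V0 / steps                      # linear growth from V0 to 2 * V0
    V = V0
    for _ in range(steps):
        # At most one synthesis event per step, so alpha0 * dt must be small.
        if rng.random() < hill_rate(P, V, alpha0, K) * dt:
            P += 1
        V += dV
    return P

def binomial_half(n, rng):
    """Sample Binomial(n, 1/2) by counting set bits among n random bits."""
    return bin(rng.getrandbits(n)).count("1") if n > 0 else 0

def simulate(alpha0, K=100.0, V0=2.0, T=50.0, dt=0.05, cycles=400, seed=0):
    """Stochastic synthesis and stochastic partitioning; returns P_0 samples."""
    rng = random.Random(seed)
    P0, samples = 0, []
    for _ in range(cycles):
        P0 = binomial_half(run_cycle(P0, alpha0, K, V0, T, dt, rng), rng)
        samples.append(P0)
    return samples[cycles // 4:]         # drop the initial transient

samples = simulate(alpha0=2.0)
print("mean P_0 =", sum(samples) / len(samples))
```

For small `alpha0` the mean after division should stay close to the unregulated value $\alpha_0 T$, while for larger `alpha0` it falls below $\alpha_0 T$ because the feedback reduces the synthesis rate.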

We begin with the case where both synthesis and
division are stochastic. The variation of the protein number
versus time for this case is depicted in Figure 3(c); the
corresponding concentration is shown in Figure 3(f). As
before, we consider the dependence of the noise strength
$\eta_0^2$ on the average protein number. In this case $\eta_0^2$
follows a $1/\langle P_0 \rangle$ behavior for small $\langle P_0 \rangle$, but crosses over
to $2/(3\langle P_0 \rangle)$ for large $\langle P_0 \rangle$ (see Figure 4, blue line with
filled squares). The crossover occurs for values of $\langle P_0 \rangle$ of
the order of $K V_0$, i.e. it occurs when the autoregulation
mechanism becomes important. For smaller $\langle P_0 \rangle$ (or $\alpha_0$),
the system behaves like an unregulated system with
considerable fluctuations due to protein synthesis as well as
division. For large $\alpha_0$ (or $\langle P_0 \rangle$) autoregulation becomes
active and suppresses protein number fluctuations, so
that protein synthesis becomes approximately
deterministic, but partitioning during division remains stochastic,
hence leading to the observed $2/(3\langle P_0 \rangle)$ behavior.

Now we separate the two sources as we did before for
the unregulated gene. In the first case, proteins are added
deterministically and partitioned stochastically.
Deterministic addition means integrating the synthesis rate of
Eq. (7) over a cycle. The resulting number of added proteins
depends on the initial protein number. The
integration usually leads to a non-integer value of the
FIG. 5. Noise strength $\eta_p^2$ of the concentration as a function
of the average protein number $\langle P \rangle$ (varied by varying the
synthesis rate $\alpha_0$) for the different models with negative
autoregulation. Averages are taken over the whole division cycle.
The parameter values are $V_0 = 2\,\mu$m$^3$, $T = 50$ min, $K = 100$
molecules/$\mu$m$^3$.



protein number; in such cases the integer part is added
deterministically, and one additional protein is added
with a probability equal to the fractional part. A
trajectory of the number of molecules per cell for this case is
depicted in Figure 3(a) and the corresponding
concentration is shown in Figure 3(d). Our simulations show that
in this case $\eta_0^2$ behaves as $2/(3\langle P_0 \rangle)$ for the entire range of
$\langle P_0 \rangle$, as in the case without autoregulation (see Figure 4,
black line with filled triangles).
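The probabilistic treatment of the non-integer part can be written in a few lines. The sketch below is one possible reading of the rule described above, with a hypothetical function name: the value is rounded down, and one more protein is added with a probability equal to the fractional part, so that the expected value is preserved.

```python
import math
import random

def stochastic_round(x, rng):
    """Round x down, then add 1 with probability equal to the fractional part."""
    whole = math.floor(x)
    return whole + (1 if rng.random() < x - whole else 0)

rng = random.Random(0)
draws = [stochastic_round(2.3, rng) for _ in range(100_000)]
# Each draw is 2 or 3; the sample mean should approach 2.3.
print(sum(draws) / len(draws))
```

Rounding in this way introduces a small amount of extra randomness into the otherwise deterministic synthesis step, but keeps the average number of added proteins equal to the integrated synthesis rate.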

Finally we consider the case where the synthesis of
the protein is a stochastic process, but partitioning
during cell division is deterministic (see Figure 3(b) for the
protein number and Figure 3(e) for the corresponding
concentration). In this case, for small $\langle P_0 \rangle$, autoregulation does
not kick in and the system behaves effectively as an
unregulated gene with $\eta_0^2 \sim 1/(3\langle P_0 \rangle)$. For large $\langle P_0 \rangle$, both
$\eta_0^2$ and the Fano factor $\eta_0^2 \times \langle P_0 \rangle$ are strongly reduced,
as the synthesis becomes almost deterministic for large
$\langle P_0 \rangle$, where protein number fluctuations are suppressed
by the negative autoregulation (see Figure 4, green line
with filled circles).

In an experiment, one typically looks at a population
of cells, which are at different points in the division cycle.
To address this situation we take averages of the protein
number or concentration over many realizations and over
time through the full division cycle, instead of over
different realizations all taken directly after division.
In doing so, the average protein concentration is a more
relevant parameter than the average protein number,
because the protein number increases two-fold over the
division cycle. We therefore determine a noise parameter $\eta_p^2$
for the concentration. We plot simulation results for the
different cases in Figure 5. When both synthesis and
division are stochastic (blue line with filled squares in Figure
5), $\eta_p^2$ retains the $1/\langle P \rangle$ behavior for small $\langle P \rangle$ (where
the gene is effectively unregulated). For large $\langle P \rangle$, that
is in the presence of negative autoregulation, the noise is
suppressed and $\eta_p^2$ shows a $1/(3\langle P \rangle)$ behavior. The other
two cases, namely stochastic synthesis with deterministic
division and vice versa, exhibit the same dependence as each
other. For both, the noise is increased compared to the earlier case
where averages were taken immediately after division.
Here $\eta_p^2$ goes as $\sim 1/(2\langle P \rangle)$ for small $\langle P \rangle$ and $\sim 1/(6\langle P \rangle)$ for
large $\langle P \rangle$ (see Figure 5, black line with filled triangles and
green line with filled circles). The latter results indicate
that negative autoregulation suppresses noise from both
sources (stochasticity of protein synthesis and stochastic
partitioning during cell division). The observation
that our results above (Figure 4) only showed suppression
of the synthesis noise and not of the partitioning noise
is due to taking averages directly after division, which
leaves no time to compensate for variations of the
protein concentration introduced during partitioning.



IV. CONCLUSIONS 

In this paper, we have reviewed models for constitutive
(unregulated) protein synthesis that we have studied
previously [23] and presented some new results for protein
synthesis with negative autoregulation. In both cases,
different model variants were considered to disentangle
different sources of stochasticity, specifically stochastic
synthesis of the protein and stochastic partitioning
during cell division.

The different models for unregulated gene expression
show that there is no dominant source of stochasticity,
as switching off one or the other source of noise leads to
similar results (we note however that 'bursting' in
protein synthesis, which we did not discuss here, can be
dominant). We found similar behavior for
the models with negative autoregulation and explicit
linear volume growth, but for these models the relation
between the noise parameter $\eta^2$ and the average protein
number shows a crossover for protein concentrations at
which autoregulation becomes important. Fluctuations
are suppressed by negative autoregulation, as expected
for such control systems and known from previous
studies. Specifically, our results show that fluctuations
arising from both sources (stochastic synthesis and
stochastic partitioning) are suppressed by negative
autoregulation, as shown by the approximately 3-fold
reduction in the Fano factor for large values of $\langle P \rangle$, where
the autoregulatory mechanism becomes active.
FIG. 6. Depiction of the model with deterministic synthesis
and stochastic partitioning in cell division. We start with
$N_0 = Q$ particles. At every step (generation) the cell divides
into two daughter cells and each protein goes to one of the
daughter cells with probability $r = 1/2$. Between two
divisions, a constant number $Q$ of proteins is added, corresponding
to a synthesis rate $\alpha = Q/T$. Only the rightmost branch
of the tree diagram, i.e. one lineage of cells, is considered.



Appendix A: Calculation of moments for stochastic 
partitioning using generating functions 

The case of deterministic addition of $Q$ molecules during
the cell division cycle and stochastic partitioning during
cell division, depicted in Figure 6, can be solved using
the method of generating functions. In this case the
number of protein molecules will follow a binomial
distribution at the time of cell division with parameters $Q$
and $r$. Thus for the generating function we have

$$
g(s) = (s r + 1 - r)^Q. \tag{A1}
$$

Let $X$ and $Y$ be two binomially distributed random
variables. Let $g_{X|y}(v_1)$ be the generating function of the
variable $X$ conditioned on a fixed value $y$ of the variable
$Y$, such that

$$
g_{X|y}(v_1) = (v_1 r + 1 - r)^y, \tag{A2}
$$

and let $g_Y(v_0)$ be the generating function for the variable
$Y$, such that

$$
g_Y(v_0) = (v_0 r + 1 - r)^N. \tag{A3}
$$

Then we can write

$$
g_X(v_1) = \sum_y g_{X|y}(v_1) \Pr(Y = y)
         = (v_0 r + 1 - r)^N \Big|_{v_0 = v_1 r + 1 - r}. \tag{A4}
$$

Now let $N_1$ be the random number of molecules in box
1 (see Figure 6). Its probability distribution is given by

$$
\mathcal{P}(N_1 = k \mid Q) = \binom{Q}{k} r^k (1 - r)^{Q - k}. \tag{A5}
$$

In the next generation, $Q$ particles are
added and then distributed randomly between the daughter
cells. Thus in the next generation we need to divide
$N_1 + Q$ molecules, with the distribution

$$
\mathcal{P}(N_2 = k \mid N_1 + Q) = \binom{N_1 + Q}{k} r^k (1 - r)^{N_1 + Q - k}. \tag{A6}
$$

Then the unconditional probability $\mathcal{P}(N_2 = k)$ is given
by

$$
\mathcal{P}(N_2 = k) = \sum_{j=0}^{Q} \mathcal{P}(N_2 = k \mid j + Q)\, \mathcal{P}(N_1 = j \mid Q). \tag{A7}
$$

Switching back to generating functions and taking into
account that $y = N_1 + Q$, we find that

$$
g_2(v_1) = \prod_{j=0}^{1} (v_j r + 1 - r)^Q \Big|_{v_0 = v_1 r + 1 - r}, \tag{A8}
$$

which can be generalized to $m$ subdivisions, leading to

$$
g_m(v_{m-1}) = \left[ \prod_{j=0}^{m-1} (v_j r + 1 - r) \right]^Q \Bigg|_{v_j = v_{j+1} r + 1 - r}. \tag{A9}
$$

This expression is enough to evaluate the moments. Let
us call the product in the bracket $F_m$, so that $g_m = F_m^Q$, and
let $v = v_{m-1}$. Also, we have

$$
\frac{dv_{j-1}}{dv_j} = r \tag{A10}
$$

for $1 \le j < m$. This derivative is equivalent to

$$
\frac{dv_l}{dv} = r^{m-1-l} \tag{A11}
$$

for $0 \le l \le m - 1$. Then we have

$$
\frac{dg_m}{dv} = \sum_{j=0}^{m-1} \frac{Q r}{v_j r + 1 - r}\, \frac{dv_j}{dv}\, F_m^Q, \tag{A12}
$$



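and for the second derivative, with $d^2 v_j/dv^2 = 0$, we have

$$
\frac{d^2 g_m}{dv^2} = F_m^Q \left[
  - \sum_{j=0}^{m-1} \frac{Q r^2}{(v_j r + 1 - r)^2} \left( \frac{dv_j}{dv} \right)^2
  + \left( \sum_{j=0}^{m-1} \frac{Q r}{v_j r + 1 - r}\, \frac{dv_j}{dv} \right)
    \left( \sum_{k=0}^{m-1} \frac{Q r}{v_k r + 1 - r}\, \frac{dv_k}{dv} \right)
\right], \tag{A13}
$$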



both of which need to be evaluated at $v_j = 1$. We get

$$
\frac{dg_m}{dv} \bigg|_{v=1} = \frac{Q r}{1 - r} (1 - r^m), \tag{A14}
$$

which in the limit $m \to \infty$ and $r = 1/2$ leads to the
average value $E[n \mid Q] = Q$. On the other hand, the second
derivative gives

$$
\frac{d^2 g_m}{dv^2} \bigg|_{v=1} = - \frac{Q r^2}{1 - r^2} (1 - r^{2m})
  + \left[ \frac{Q r}{1 - r} (1 - r^m) \right]^2, \tag{A15}
$$

which for $m \to \infty$ and $r = 1/2$ gives $E[n(n-1) \mid Q] =
Q^2 - Q/3$. Finally we obtain the variance as

$$
\mathrm{Var}[n \mid Q] = E[n(n-1) \mid Q] + E[n \mid Q] - E[n \mid Q]^2 = \frac{2}{3} Q, \tag{A16}
$$

which gives $\eta^2 = 2/(3Q)$.
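The moments obtained from the generating function can be cross-checked against a direct recursion for the mean and variance. The sketch below is an independent check under stated assumptions, not part of the original derivation: it uses the law of total variance for binomial partitioning and compares the result with the closed forms $E[n] = Qr(1 - r^m)/(1 - r)$ and $E[n(n-1)] = -Qr^2(1 - r^{2m})/(1 - r^2) + E[n]^2$ after $m$ divisions. Exact rational arithmetic avoids rounding errors; the function names are hypothetical.

```python
from fractions import Fraction

def moments_by_recursion(Q, r, m):
    """Mean and variance after m divisions, with Q molecules before the first one."""
    mean, var = Fraction(0), Fraction(0)
    for _ in range(m):
        before = mean + Q                          # Q molecules added deterministically
        var = r * r * var + r * (1 - r) * before   # law of total variance
        mean = r * before
    return mean, var

def moments_closed_form(Q, r, m):
    """Mean and variance from the closed-form first and second factorial moments."""
    mean = Q * r * (1 - r**m) / (1 - r)
    second_factorial = -Q * r**2 * (1 - r**(2 * m)) / (1 - r**2) + mean**2
    return mean, second_factorial + mean - mean**2

Q, r = 20, Fraction(1, 2)
for m in (1, 2, 5, 50):
    print(m, moments_by_recursion(Q, r, m) == moments_closed_form(Q, r, m))
mean, var = moments_by_recursion(Q, r, 50)
print(float(mean), float(var))                     # approaches Q and 2 * Q / 3
```

For $r = 1/2$ and large $m$ both routes converge to a mean of $Q$ and a variance of $2Q/3$, in agreement with the noise strength $\eta^2 = 2/(3Q)$.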